Reconstructing an Indo-European Family Tree from Non-native English Texts

Authors

  • Ryo Nagata
  • Edward W. D. Whittaker
Abstract

Mother tongue interference is the phenomenon where linguistic systems of a mother tongue are transferred to another language. Although there has been plenty of work on mother tongue interference, very little is known about how strongly it is transferred to another language and about what relation there is across mother tongues. To address these questions, this paper explores and visualizes mother tongue interference preserved in English texts written by Indo-European language speakers. This paper further explores linguistic features that explain why certain relations are preserved in English writing, and which contribute to related tasks such as native language identification.

Similar resources

Language Family Relationship Preserved in Non-native English

Mother tongue interference is the phenomenon where linguistic systems of a mother tongue are transferred to another language. Recently, Nagata and Whittaker (2013) have shown that the language family relationship among mother tongues is preserved in English written by Indo-European language speakers because of mother tongue interference. At the same time, their findings further introduce the followi...

Word Etymology as Native Language Interference

We present experiments that show the influence of native language on lexical choice when producing text in another language – in this particular case English. We start from the premise that non-native English speakers will choose lexical items that are close to words in their native language. This leads us to an etymology-based representation of documents written by people whose mother tongue is...

Reconstructing the sounds of words from the past

We are developing novel statistical and signal processing methods to work backwards from contemporary audio recordings of simple words in modern Indo-European languages to regenerate audible spoken forms from earlier points in the language family tree. In this paper we present our first tentative steps in developing some of the necessary technical methods for realising this ambition, especially...

Aligning Parallel English-Chinese Texts Statistically with Lexical Criteria

We describe our experience with automatic alignment of sentences in parallel English-Chinese texts. Our report concerns three related topics: (1) progress on the HKUST English-Chinese Parallel Bilingual Corpus; (2) experiments addressing the applicability of Gale & Church's (1991) length-based statistical method to the task of alignment involving a non-Indo-European language; and (3) an improve...

A Comparison of Phylogenetic Reconstruction Methods on an Indo-European Dataset

Researchers interested in the history of the Indo-European family of languages have used a variety of methods to estimate the phylogeny of the family, and have obtained widely differing results. In this paper we explore the reconstructions of the Indo-European phylogeny obtained by using the major phylogeny estimation procedures on an existing database of 336 characters (including lexical, phon...

Publication date: 2013